Voicing parameter and energy based speech/non-speech detection for speech recognition in adverse conditions

نویسندگان

  • Arnaud Martin
  • Laurent Mauuary
چکیده

In adverse conditions, the speech recognition performance decreases in part due to imperfect speech/non-speech detection. In this paper, a new combination of voicing parameter and energy for speech/non-speech detection is described. This combination avoids especially the noise detections in real life very noisy environments and provides better performance for continuous speech recognition. This new speech/non-speech detection approach outperforms both noise statistical based [1] and Linear Discriminate Analysis (LDA) based [2] criteria in noisy environments and for continuous speech recognition applications.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust speech/non-speech detection based on LDA-derived parameter and voicing parameter for speech recognition in noisy environments

Every speech recognition system contains a speech/non-speech detection stage. Detected speech sequences are only passed through the speech recognition stage later on. In a very noisy environment, the noise detection stage is generally responsible for most of the recognition errors. Indeed, many detected noisy periods can be recognized as a vocabulary word. This manuscript provides solutions to ...

متن کامل

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

Real time robust speech detection for text independent speaker recognition

Speaker recognition systems employ a speech detection algorithm and use only frames detected as speech for further processing. The accuracy obtained by a speaker recognition system depends on the method that is used to detect speech, in particular for real-life deployments where the incoming speech varies significantly in loudness and noise characteristics. Also, actual deployments mandate real...

متن کامل

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

متن کامل

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003